Sync upstream NVIDIA/SkillSpector (2.2.3 → 2.3.7) #6
Merged
LP3 fired on standard-compliant skills that declare capabilities via the Agent Skills `allowed-tools` field: _parse_manifest dropped allowed-tools, and the analyzer only read `permissions`, so a skill declaring `allowed-tools: [Bash, Read]` was reported as having no declared permissions. Parse and preserve `allowed-tools` (list or comma-separated string) in _parse_manifest, and treat a non-empty allowed-tools alongside `permissions` when evaluating LP3's "no declared permissions" condition. LP1/LP4 continue to use the explicit `permissions` list only. Add regression tests for allowed-tools parsing (both forms) and for LP3 no longer firing when allowed-tools is declared. Signed-off-by: CharmingGroot <70020572+CharmingGroot@users.noreply.github.com>
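A minimal sketch of the parsing and the LP3 condition described above. The helper names `parse_allowed_tools` and `has_declared_permissions` are hypothetical and are not taken from the SkillSpector source; the manifest is assumed to be a plain dict.

```python
def parse_allowed_tools(raw):
    """Normalize an `allowed-tools` value (list or comma-separated string) to a list."""
    if raw is None:
        return []
    if isinstance(raw, str):
        items = raw.split(",")
    elif isinstance(raw, (list, tuple)):
        items = [str(item) for item in raw]
    else:
        # Unknown shape: treat as "nothing declared" (an assumption of this sketch).
        return []
    return [item.strip() for item in items if item.strip()]


def has_declared_permissions(manifest):
    """LP3-style check: either `permissions` or `allowed-tools` counts as a declaration."""
    if manifest.get("permissions"):
        return True
    return bool(parse_allowed_tools(manifest.get("allowed-tools")))
```

Under these assumptions, a manifest with `allowed-tools: [Bash, Read]` and no `permissions` key counts as declared, so an LP3-style rule would not fire on it.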
One exception anywhere in the arun_batches fan-out aborted the whole Stage 2 pass: asyncio.gather without return_exceptions cancelled the remaining batches, the meta-analyzer's blanket except caught the propagated error, and every file silently fell back to static-only results while the CLI still exited 0 (NVIDIA#9). arun_batches now isolates failures per batch: a transient error (timeout, 429, oversized-chunk 400) is logged and costs only its own batch. ValueError and NotImplementedError still propagate, since they signal misconfiguration rather than infra trouble. With partial results possible, apply_filter could no longer treat a missing confirmation as a rejection: a finding whose batch never returned would be silently dropped — a false negative manufactured by an infrastructure error (NVIDIA#11). The meta-analyzer now partitions findings by whether a returned batch actually carried them: analysed findings go through the normal confirm-or-drop filter, unanalysed ones are kept via the existing fallback path, and a WARNING logs how many findings were kept unfiltered so the gap is visible. Net effect: an infra failure can only ever cost enrichment on a file, never the finding itself, and one bad call no longer turns off the semantic filter for the whole scan. Fixes NVIDIA#9, fixes NVIDIA#11 Signed-off-by: nyxst4ck <289980115+nyxst4ck@users.noreply.github.com>
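One way to get the per-batch isolation described above is to guard each batch call individually. This is a sketch, not the actual `arun_batches` code: `run_batches_isolated`, `_guarded`, and the `call` parameter are hypothetical names, and a failed batch is represented as `None`.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Errors that signal misconfiguration and should still abort the whole pass.
FATAL_ERRORS = (ValueError, NotImplementedError)


async def _guarded(call, batch, index):
    """Run one batch; a transient failure is logged and yields None."""
    try:
        return await call(batch)
    except FATAL_ERRORS:
        raise
    except Exception as exc:
        logger.warning("batch %d failed, continuing without it: %r", index, exc)
        return None


async def run_batches_isolated(batches, call):
    """Return one entry per batch, in order: the result, or None if that batch failed."""
    return await asyncio.gather(
        *(_guarded(call, batch, index) for index, batch in enumerate(batches))
    )
```

A caller can then tell analysed findings (their batch returned a result) from unanalysed ones (their batch is `None`) and keep the latter unfiltered, which is the partitioning the commit message describes.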
The MCP, semantic, and taint-tracking analyzers are implemented, but DEVELOPMENT.md still described them as stubs. Update the package-layout and "Stub analyzers" sections to reflect actual status (only mcp_rug_pull remains a stub), fix an invalid `//` comment in the Python example, and replace dangling internal "SADD" references in graph.py with neutral roadmap notes. Also refresh two stale "stub" docstrings. Docs and comments only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
Add a GitHub Actions workflow that runs on pull_request and push to main:
- Lint and format checks with ruff
- Unit tests with pytest (integration tests excluded to avoid LLM secrets)
- Matrix over Python 3.12 and 3.13 on ubuntu-latest
- DCO sign-off verification on PRs
Windows is excluded because the test suite has known path-separator failures in build_context that are out of scope for this change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
CRITICAL and HIGH static findings are now always kept in the output of
apply_filter(), even when the LLM does not confirm them (omitted, denies the
finding, or returns confidence < 0.6). MEDIUM/LOW findings continue to be
filtered by the LLM for false-positive reduction as before.
Motivation: the LLM receives attacker-controlled skill content. A prompt-
injection payload embedded in a scanned skill could cause the LLM to drop a
real CRITICAL or HIGH static finding, hiding it from the security report
(a false negative in a security gate). This invariant applies to all providers.
Implementation:
- LLMMetaAnalyzer._HIGH_SEVERITY_FLOOR = frozenset({"CRITICAL", "HIGH"}).
- In the "not confirmed" branch of apply_filter(), if the finding's severity is
in the floor set, emit the original static finding unchanged and append the
tag "llm-unconfirmed" so consumers can distinguish it from LLM-validated
findings. The tag is not duplicated if already present.
- Confirmed CRITICAL/HIGH findings are still enriched with LLM explanation/
remediation/confidence as before (no regression).
- Finding.to_dict() now includes "tags", so the "llm-unconfirmed" marker is
visible in the JSON report (tags were previously not serialized).
Tests in tests/nodes/test_llm_analyzer_base.py:
- CRITICAL/HIGH finding unconfirmed -> kept, "llm-unconfirmed" tag, original data.
- MEDIUM/LOW finding unconfirmed -> still dropped (existing behaviour).
- Confirmed CRITICAL -> enriched normally, tag absent.
- Duplicate-tag guard -> "llm-unconfirmed" not appended twice.
- to_dict surfacing -> marker present in JSON output.
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
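The "not confirmed" branch described in this commit can be sketched roughly as follows. The `Finding` dataclass and `keep_unconfirmed` function are simplified stand-ins invented for illustration; only the floor set and the `llm-unconfirmed` tag come from the commit message.

```python
from dataclasses import dataclass, field

HIGH_SEVERITY_FLOOR = frozenset({"CRITICAL", "HIGH"})
UNCONFIRMED_TAG = "llm-unconfirmed"


@dataclass
class Finding:
    """Simplified stand-in for a static finding (hypothetical shape)."""
    rule_id: str
    severity: str
    tags: list = field(default_factory=list)


def keep_unconfirmed(finding):
    """Decide what to emit when the LLM did not confirm a static finding.

    Returns the original finding (tagged) for CRITICAL/HIGH, or None to drop it.
    """
    if finding.severity not in HIGH_SEVERITY_FLOOR:
        return None
    if UNCONFIRMED_TAG not in finding.tags:  # duplicate-tag guard
        finding.tags.append(UNCONFIRMED_TAG)
    return finding
```

The point of the design is that the LLM can only add information to a CRITICAL or HIGH finding; it can never remove one, whatever the scanned content tells it.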
Add an "Integrating SkillSpector" section (exit codes, JSON shape, severity/recommendation enums, and a recommended install-gate mapping) and a "Trust model and data egress" section (no skill execution; LLM sends file contents unless --no-llm; SC4 sends dependency names to OSV.dev by design). Cross-link from DEVELOPMENT.md. Documentation only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
…d_line apply_filter dropped LLM-confirmed findings whose static end_line is None when the model populated end_line (e.g. end_line == start_line, as DeepSeek does): the granular, start_only and coarse lookups all missed and the finding was silently filtered out. This turned a CRITICAL skill (live OSV CVEs) into SAFE once LLM analysis was enabled. Add an end_line-agnostic fallback keyed by (file, rule_id, start_line), gated on the static finding having end_line is None so findings deliberately distinguished by end_line keep exact matching. Add regression tests. Fixes NVIDIA#67 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Jiaying Huang <hjyxka@gmail.com>
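A rough sketch of the gated fallback, assuming confirmations are stored in a dict keyed by `(file, rule_id, start_line, end_line)`. That storage shape and the name `find_confirmation` are assumptions of this example; the real lookup tiers (granular, start_only, coarse) are not reproduced here.

```python
def find_confirmation(static, confirmations):
    """Look up the LLM confirmation for a static finding.

    `static` is a dict with file, rule_id, start_line, end_line.
    `confirmations` maps (file, rule_id, start_line, end_line) to a confirmation.
    """
    exact = (static["file"], static["rule_id"], static["start_line"], static["end_line"])
    if exact in confirmations:
        return confirmations[exact]
    # Fallback only when the static finding has no end_line, so findings that are
    # deliberately distinguished by end_line keep exact matching.
    if static["end_line"] is None:
        prefix = exact[:3]
        for key, value in confirmations.items():
            if key[:3] == prefix:
                return value
    return None
```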
build_context emitted component/file-cache paths using the OS-native separator (e.g. scripts\helper.py on Windows), which broke SARIF output and tests that expect POSIX-style paths. Normalize relative paths with Path.as_posix() so analysis output is identical across platforms. Apply the same as_posix() normalization to the five test helpers that build file caches from fixture directories. The CLI terminal report also crashed on Windows consoles (cp1252) with UnicodeEncodeError when rendering box-drawing characters and icons. Reconfigure stdout/stderr to UTF-8 (errors="replace") at startup so the report renders without crashing on any platform. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build_context emitted OS-native path separators for relative component paths, producing backslashes on Windows (e.g. `references\guide.md`). These paths are used as dict keys (components / file_cache / component_metadata) and as SARIF physicalLocation URIs, which are meant to be portable forward-slash paths — so on Windows the report locations were non-portable and downstream lookups (and the test suite) mismatched. Use Path.as_posix() so component paths are forward-slash on every OS, in build_context and the handful of test helpers that mirror the same relative_to logic. Behaviour-preserving on Linux/macOS (str() already yields forward slashes). Fixes NVIDIA#86 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
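The difference between `str()` and `as_posix()` can be reproduced on any OS with `PureWindowsPath`. The helper name `component_key` is hypothetical and only illustrates the `relative_to` plus `as_posix()` pattern.

```python
from pathlib import PureWindowsPath


def component_key(path, root):
    """Return a portable forward-slash key for `path` relative to `root`."""
    return path.relative_to(root).as_posix()


root = PureWindowsPath("C:/skill")
guide = PureWindowsPath("C:/skill/references/guide.md")

native = str(guide.relative_to(root))   # backslash-separated on Windows paths
portable = component_key(guide, root)   # forward slashes on every OS
```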
…API key
Add four agent-CLI LLM providers driven by locally-installed, already-authenticated
CLI binaries (claude, codex, gemini, agy) instead of metered HTTP endpoints:
SKILLSPECTOR_PROVIDER=claude_cli -> local `claude` OAuth session, no API key
SKILLSPECTOR_PROVIDER=codex_cli -> local `codex` session
SKILLSPECTOR_PROVIDER=gemini_cli -> local `gemini` session
SKILLSPECTOR_PROVIDER=antigravity_cli -> registered but DISABLED (fail-closed;
agy is TTY-only, uncapturable on a pipe)
Security chokepoint: all subprocess I/O goes through run_agent_cli() in
_agent_cli.py (shell=False, prompt via stdin only, capability-stripped argv,
env scrub, temp CWD, bounded streaming, fail-closed). _agent_cli_base.py
is the shared provider base class; the four concrete providers are ~5-line
subclasses.
No pinned model: CLI providers forward no --model by default so the user's
own configured model is used; SKILLSPECTOR_MODEL overrides. model_registry.yaml
files are absent; metadata methods return None and fall through to default
token budgets.
The AgentCLIChatModel adapter in llm_utils.py mimics the ChatOpenAI
interface (.invoke / .ainvoke / .with_structured_output) backed by the
provider's complete() subprocess transport, so existing LLM analyzers
(meta_analyzer, semantic_*) work with no code changes.
Rebased and adapted onto upstream provider refactor (a5092dd):
- providers/base.py: adds AgentCLICapable + has_cli_capability alongside
upstream's new ChatModelProvider / LLMProvider protocols.
- providers/__init__.py: registers CLI providers in _select_active_provider;
preserves upstream's create_chat_model / resolve_chat_model_credentials /
NO_LLM_API_KEY_MESSAGE / raise_no_llm_api_key_configured; CLI branch skips
create_chat_model (no HTTP transport) and calls raise_no_llm_api_key_configured.
- llm_utils.py: get_chat_model branches on has_cli_capability — returns
AgentCLIChatModel for CLI providers; delegates to providers.create_chat_model
(which uses upstream's native-client path, e.g. ChatAnthropic) for HTTP ones.
- Tests: merged upstream's ChatAnthropic/create_chat_model/NO_LLM_API_KEY_MESSAGE
coverage with PR#52's CLI dispatch/adapter/antigravity/no-pinned-model tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
Signed-off-by: sjh9714 <163989462+sjh9714@users.noreply.github.com>
MetaAnalyzerResult had a field_validator to repair `overall_assessment`
when an LLM returns it as a JSON string, but no equivalent for `findings`.
Some models/endpoints (e.g. Claude Sonnet via an OpenAI-compatible
gateway) return the top-level `findings` array as a JSON string, which
fails Pydantic list validation and crashes the scan with exit code 2
before any report is written.
Add a mirrored `field_validator("findings", mode="before")` that
json.loads a string and falls back to an empty list on parse failure or
non-list payloads, matching the existing `overall_assessment` handling.
Covered by new unit tests in TestMetaAnalyzerResultFindingsValidator.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
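The repair logic can be shown as a plain function, independent of Pydantic. This is an illustrative sketch: `coerce_findings` is a hypothetical name, and in the real model the equivalent logic runs inside the `mode="before"` validator mentioned in the commit message.

```python
import json


def coerce_findings(value):
    """Repair a `findings` payload that arrived as a JSON string.

    Non-string values pass through unchanged so normal validation still applies.
    """
    if not isinstance(value, str):
        return value
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        return []  # unparseable string: fall back to an empty list
    return parsed if isinstance(parsed, list) else []
```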
No existing analyzer flags skills that read the agent's own config directories, access MCP server config files, or enumerate other installed skills. All three vectors let a malicious skill discover API keys, tool definitions, and peer-skill prompts it has no legitimate need to see.
Add static_patterns_agent_snooping with three rule IDs:
AS1 – Agent config directory access (.claude/, .codex/, .gemini/)
AS2 – MCP config file access (mcp.json / mcp_config.json)
AS3 – Skill enumeration (listing or reading other skills' files)
Register the new node in ANALYZER_NODE_IDS / ANALYZER_NODES (21 total), add AGENT_SNOOPING category and full AS1-AS3 entries to pattern_defaults, update the registry test, and add an integration test class covering true-positive and safe-content (false-positive) cases.
Closes NVIDIA#75
Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
…n Source) P2_PATTERNS was missing Unicode bidi override/embedding/isolate characters (U+202A-U+202E, U+2066-U+2069) that can be used to hide malicious instructions from human code review while the LLM sees and executes them. Add the range with confidence 0.85 (higher than plain zero-width chars because bidi controls have almost no legitimate use in AI skill content). Closes NVIDIA#39 Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
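A small sketch of matching the two code-point ranges named above. The names `BIDI_CONTROL_RE` and `find_bidi_controls` are hypothetical; the actual P2_PATTERNS entry format is not shown in this message. Escapes are used so the source itself contains no bidi characters.

```python
import re

# U+202A-U+202E: embeddings and overrides; U+2066-U+2069: isolates.
BIDI_CONTROL_RE = re.compile("[\u202a-\u202e\u2066-\u2069]")


def find_bidi_controls(text):
    """Return (offset, code point label) for every bidi control character in `text`."""
    return [
        (match.start(), f"U+{ord(match.group()):04X}")
        for match in BIDI_CONTROL_RE.finditer(text)
    ]
```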
…alidation LLMFinding and MetaAnalyzerFinding both hard-fail with le=1.0 when Ollama (or other local models) return confidence as an integer on a 0-100 scale. Add a mode="before" field_validator that: converts to float, divides by 100 if the value exceeds 1.0, then clamps to [0.0, 1.0]. This also handles negative values and values above 100 gracefully rather than crashing the meta-analyzer for the entire file. Closes NVIDIA#89 Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
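The normalization steps listed above (convert to float, divide by 100 if above 1.0, clamp) look roughly like this as a standalone function. `normalize_confidence` is a hypothetical name; in the models this logic would sit inside the `mode="before"` validator.

```python
def normalize_confidence(value):
    """Map a confidence on either a 0-1 or 0-100 scale into [0.0, 1.0]."""
    score = float(value)
    if score > 1.0:
        score /= 100.0  # treat values above 1.0 as percentages
    return max(0.0, min(1.0, score))
```

Note one consequence of this rule: an input of exactly `1` is read as full confidence on the 0-1 scale, not as 1 percent.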
…etection Absolute edit-distance <= 2 produces false positives on short, legitimate package names: "task" is a real package and is edit-distance 2 from "flask", yet was flagged as a possible typosquat (one of the cases reported in NVIDIA#103). Add a relative-distance guard (dist/shorter_len <= 1/3) so short names need an all-but-one-character match while longer names may still differ by two characters.
Existing behaviour is preserved:
- "reqeusts" -> "requests" (len 8, dist 2) still flagged
- "expreess" -> "express" (dist 1) still flagged
- "task" -> "flask" (len 4, dist 2) no longer flagged
Adds a regression test; all existing SC4/SC5/SC6 unit tests still pass and ruff is clean.
Refs NVIDIA#103
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Yonatan Gross <yonatan2gross@gmail.com>
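The guard can be checked against the three cases above with a plain Levenshtein distance. This sketch assumes the detector uses standard edit distance (a transposition counts as 2); `edit_distance` and `looks_like_typosquat` are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def looks_like_typosquat(name, popular, max_distance=2):
    """Flag `name` only if it is close to `popular` in absolute and relative terms."""
    dist = edit_distance(name, popular)
    if dist == 0 or dist > max_distance:
        return False
    shorter = min(len(name), len(popular))
    # dist / shorter <= 1/3, written with integers to avoid float rounding.
    return dist * 3 <= shorter
```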
`_TP3_MALICIOUS_URL_RE` exempted any default URL whose host merely *starts with* "localhost"/"127.0.0.1": the negative lookahead matched the bare substring with no host boundary. An attacker-controlled host such as `http://localhost.evil.com/exfil` was therefore treated as loopback and skipped, silently bypassing the malicious-default-URL detection in TP3 — an attacker only needs to register a host beginning with `localhost.`. Anchor the exemption with `(?:[:/?#]|$)` so only genuine loopback hosts (optionally followed by a port/path/query/fragment) are exempt. External hosts that share the prefix are now flagged; real loopback defaults (`http://localhost:8080/...`, `http://127.0.0.1/...`) stay exempt. Adds regression tests for localhost-/127.0.0.1-prefixed attacker URLs and a guard that genuine loopback defaults remain exempt. Signed-off-by: jichao wang <jichaowang02@gmail.com>
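The effect of the host-boundary anchor can be seen in a reduced pattern. This is not the real `_TP3_MALICIOUS_URL_RE`, whose full form is not shown here; `EXTERNAL_URL_RE` and `is_flagged` are simplified, hypothetical stand-ins that keep only the negative lookahead and the `(?:[:/?#]|$)` boundary.

```python
import re

# Loopback host followed by a port/path/query/fragment delimiter, or end of string.
LOOPBACK_EXEMPT = r"(?:localhost|127\.0\.0\.1)(?:[:/?#]|$)"
EXTERNAL_URL_RE = re.compile(rf"https?://(?!{LOOPBACK_EXEMPT})\S+")


def is_flagged(url):
    """True if the URL is not a genuine loopback address."""
    return EXTERNAL_URL_RE.match(url) is not None
```

Without the boundary group, the lookahead would reject anything starting with `localhost`, which is exactly the bypass the commit closes.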
behavioral_ast and behavioral_taint_tracking resolved call names purely
syntactically, so dangerous calls imported under an alias were missed:
from os import system # system("id") -> AST5 missed
import os as o # o.system("id") -> AST5 missed
from subprocess import run # run([...]) -> AST4 missed
import subprocess as sp # sp.run(secret) -> AST4 / TT missed
A skill could evade detection simply by importing the primitive under
another name. This reuses the existing import-alias scan
(_build_import_aliases), exposes it as build_import_aliases(), adds an
apply_import_aliases() helper, and threads an optional `aliases` argument
through resolve_call_name() and resolve_call_name_typed(). Aliases are
normalized before the type-map lookup so already-canonical names are not
re-expanded (e.g. `from socket import socket` must not yield
socket.socket.socket.recv).
Adds TestImportAliasEvasion coverage to both analyzers: aliased dangerous
forms are now detected, while aliased-but-safe imports produce no findings.
Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
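To illustrate the idea, here is a small standalone alias resolver built on the standard `ast` module. It is a sketch, not the project's `build_import_aliases` / `resolve_call_name` code: the names `collect_aliases`, `canonical_call_name`, and `called_names` are hypothetical, and it skips the normalization step that prevents re-expanding already-canonical names.

```python
import ast


def collect_aliases(tree):
    """Map local names to the dotted names they were imported from."""
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for item in node.names:
                if item.asname:  # import os as o -> {"o": "os"}
                    aliases[item.asname] = item.name
        elif isinstance(node, ast.ImportFrom) and node.module:
            for item in node.names:  # from os import system -> {"system": "os.system"}
                aliases[item.asname or item.name] = f"{node.module}.{item.name}"
    return aliases


def canonical_call_name(call, aliases):
    """Resolve a call like `o.system(...)` to `os.system`, or None if not a plain name."""
    parts = []
    node = call.func
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if not isinstance(node, ast.Name):
        return None
    head = aliases.get(node.id, node.id)
    return ".".join([head, *reversed(parts)])


def called_names(source):
    """Return the canonical name of every call in `source`."""
    tree = ast.parse(source)
    aliases = collect_aliases(tree)
    return [
        canonical_call_name(node, aliases)
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
    ]
```

With this in place, a rule keyed on `os.system` or `subprocess.run` matches all four aliased forms listed in the commit message.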
…patibility jsonschema-rs (v0.29.1) fails to build on Python 3.14 because the bundled PyO3 (v0.23.4) does not support Python 3.14. Adjust the requires-python range from <3.15 to <3.14 and remove the 3.14 classifier to accurately reflect supported versions. Fixes NVIDIA#111 Signed-off-by: Perseus Computing <51974392+tcconnally@users.noreply.github.com>
Four changes to improve SC4 (Known Vulnerable Dependencies) reliability:
1. Configurable timeout: read the SKILLSPECTOR_OSV_TIMEOUT env var (default 30s; the timeout was previously hardcoded to 10s) so users in high-latency environments can increase it.
2. Increased default timeouts: raised the query timeout from 10s to 30s and the is_available() check from 5s to 15s to reduce the silent fallback rate.
3. Visible fallback warning: when OSV.dev is unreachable and the static fallback finds nothing, emit a LOW-severity SC4 finding alerting users that results may be incomplete. Previously the fallback was only visible in --verbose logs.
4. Distinguish clean packages from failed lookups: added an INFO log when OSV.dev returns no vulnerabilities for a package (previously silently cached), and a was_osv_reachable() helper so callers can tell API failures from clean results.
Fixes NVIDIA#102
Signed-off-by: Perseus Computing <51974392+tcconnally@users.noreply.github.com>
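Reading the timeout from the environment might look like the sketch below. Only the variable name and the 30s default come from the commit message; the function name `osv_timeout` and the fallback to the default on invalid or non-positive values are assumptions of this example.

```python
import os

DEFAULT_OSV_TIMEOUT = 30.0


def osv_timeout(environ=None):
    """Return the OSV.dev query timeout in seconds."""
    environ = os.environ if environ is None else environ
    raw = environ.get("SKILLSPECTOR_OSV_TIMEOUT")
    if raw is None:
        return DEFAULT_OSV_TIMEOUT
    try:
        value = float(raw)
    except ValueError:
        # Assumption: an unparseable value falls back to the default.
        return DEFAULT_OSV_TIMEOUT
    return value if value > 0 else DEFAULT_OSV_TIMEOUT
```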
… == and <= When requirements.txt uses >=, >, ~=, or != operators, the version was set to None before querying OSV.dev, causing all historical CVEs to be reported. This happened because _extract_packages_from_requirements() only forwarded the version for == and <= operators. Fix: pass the bound version for all operators. OSV.dev uses the version to filter CVEs, so passing e.g. "1.26.0" for "numpy>=1.26.0" returns only CVEs affecting that version, not all historical advisories. Fixes NVIDIA#43 Signed-off-by: Perseus Computing <51974392+tcconnally@users.noreply.github.com>
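A simplified illustration of forwarding the bound version regardless of operator. The regex and the name `bound_version` are hypothetical; they do not reproduce `_extract_packages_from_requirements()` and ignore extras, markers, and multi-clause specifiers beyond the first clause.

```python
import re

REQUIREMENT_RE = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)"          # package name
    r"\s*(?:(==|<=|>=|~=|!=|<|>)\s*([^\s,;#]+))?"  # optional operator and version
)


def bound_version(line):
    """Return (name, version) for a requirements line; version is None if unpinned."""
    match = REQUIREMENT_RE.match(line)
    if match is None:
        return None
    name, _operator, version = match.groups()
    return name, version
```

The version returned here is what would be passed to the vulnerability query, so `numpy>=1.26.0` is checked as 1.26.0 instead of as "any version".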
Add a [tool.uv] section to pyproject.toml and document the uv tool install quick-start command in the README. This allows users to install SkillSpector with a single command:
uv tool install git+https://github.com/NVIDIA/skillspector.git
Updates are equally simple:
uv tool upgrade skillspector
Fixes NVIDIA#99
Signed-off-by: Perseus Computing <51974392+tcconnally@users.noreply.github.com>
Adds Perseus and Mimir to the _POPULAR_PYPI set used by SC6 typosquat detection. These are established MCP ecosystem packages (>1K downloads) that should not be flagged as potential typosquats. Including them also helps protect the package names from typosquatting confusion. Signed-off-by: Perseus Computing <51974392+tcconnally@users.noreply.github.com>
…dict endpoints Add a new `anthropic_proxy` provider that enables SkillSpector to connect to Anthropic Claude models served behind Vertex-style raw-predict endpoints (corporate API gateways, GCP Vertex AI, self-hosted proxies). These endpoints differ from api.anthropic.com in URL structure (model in path), auth mechanism (Bearer token), and body format (anthropic_version in body, no model field). The provider uses custom httpx transports to rewrite ChatAnthropic SDK requests at the HTTP layer, preserving all LangChain features (tool calling, structured output, streaming).
Configuration:
ANTHROPIC_PROXY_ENDPOINT_URL: full endpoint URL
ANTHROPIC_PROXY_API_KEY: Bearer token
ANTHROPIC_PROXY_API_VERSION: optional (default: vertex-2023-10-16)
Closes NVIDIA#127
Signed-off-by: Alen Joses R <alr@redhat.com>
Signed-off-by: alenjoses <alr@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Addresses review feedback on NVIDIA#52 (reporter: @486). The claude provider requested `--output-format json`, whose envelope shape is not a stable contract across Claude Code builds (single object in 2.1.177, JSON array of stream events in 2.1.178). `_parse_claude_output` only handled the dict shape, so on newer builds it raised "expected a JSON object from claude, got list" and every Stage-2 semantic analyzer silently degraded to static-only. Switch to `--output-format text`, the stable headless contract: stdout is the assistant's reply with no envelope, unchanged since `-p` was introduced. The structured output the analyzers need is produced by the model (schema appended to the prompt) and validated by Pydantic in the adapter layer, so it is unaffected by the transport change. `_parse_claude_output` collapses to strip-and-return (raise on empty). Also dedup: `_parse_agy_output` was a verbatim copy of `_parse_gemini_output` and dead code (since `_build_agy_argv` fails closed). Point agy's registry entry at `_parse_gemini_output` (agy's backend is Gemini). Re-verified agy 1.0.10 still cannot be driven via a pipe (hangs, empty stdout) — the disabled status remains correct. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
Addresses the second review point on NVIDIA#52 (reporter: @486): when use_llm is requested but every LLM-backed analyzer fails at runtime (transport/parse/auth error), each node swallows its exception and returns no findings while the report still claimed llm_available=true -- so a deep scan silently became static-only with no visible signal.
- Add SkillspectorState['llm_call_log'] (additive list, same reducer as findings) plus the llm_call_record() helper. Each LLM-backed node (the 3 semantic analyzers + meta_analyzer) appends one record per run: ok=True on success, ok=False with the error on the fallback paths. Intentional skips (use_llm=False, empty inputs) record nothing, so they are never mistaken for failures.
- report._build_metadata computes a "degraded" state (use_llm requested, >=1 call attempted, 0 succeeded) and then sets llm_available=false, adds llm_degraded=true and llm_calls_attempted/succeeded, and an llm_error explaining the static-only fallback with deduped reasons.
- Visible warning in terminal and markdown output; a single aggregate warning is logged in report() regardless of output format.
Tests: degradation aggregation across output formats (test_report.py) and per-node telemetry emission (success / failure / intentional-skip) for the semantic analyzers. No net-new failures vs the pre-existing suite baseline.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
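The "degraded" rule (use_llm requested, at least one call attempted, none succeeded) can be sketched as a pure function over the call log. The function name `llm_metadata`, the record shape, and the wording of `llm_error` are assumptions; the metadata keys follow the commit message.

```python
def llm_metadata(use_llm, call_log):
    """Summarize LLM call telemetry into report metadata.

    `call_log` is a list of dicts with keys "node", "ok", and optionally "error".
    """
    attempted = len(call_log)
    succeeded = sum(1 for record in call_log if record["ok"])
    degraded = bool(use_llm and attempted >= 1 and succeeded == 0)
    metadata = {
        "llm_calls_attempted": attempted,
        "llm_calls_succeeded": succeeded,
        "llm_degraded": degraded,
    }
    if degraded:
        metadata["llm_available"] = False
        # Deduplicate reasons so one repeated transport error is reported once.
        reasons = sorted({record["error"] for record in call_log if record.get("error")})
        metadata["llm_error"] = "static-only fallback: " + "; ".join(reasons)
    return metadata
```

Because intentional skips record nothing, an empty log with `use_llm=True` is not treated as degraded in this sketch.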
…d records) Hardening pass on the degradation signal so no LLM call site or output format can hide a degraded scan:
- Instrument mcp_tool_poisoning's TP4 (the 5th LLM-backed call site, previously uninstrumented). _check_tp4 now returns (findings, record): ok=True/False only when an LLM call was actually attempted, None for the no-description / no-code early-outs so an intentional no-op is never counted as a degraded stage.
- Surface degradation in the default SARIF output via the standard invocations[].toolExecutionNotifications (warning level); add SarifInvocation / SarifNotification models. executionSuccessful stays true (the scan completed; only the LLM sub-stage degraded). Healthy scans emit no invocations block, so existing SARIF output is unchanged.
- Replace the untyped dict telemetry record with an LLMCallRecord TypedDict (node/ok/error), matching the repo's TypedDict-based state convention.
Tests: SARIF degradation notification + schema validity and the unchanged healthy case; TP4 telemetry (success / failure / no-call / use_llm=false). No net-new failures vs the pre-existing suite baseline; ruff + format clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
feat: add AWS Bedrock provider for Claude via SigV4
Add DEFAULT_MODEL and SLOT_DEFAULTS as ClassVar declarations to ModelMetadataProvider Protocol so callers no longer need type: ignore[attr-defined] when accessing provider attributes. Add validate_base_url() helper that warns on malformed (non-http/https or missing host) base URLs at model creation time, catching operator misconfigurations early without raising. Remove stale type: ignore[attr-defined] from constants.py. Signed-off-by: mimran-khan <mohammed_imran.khan@outlook.com> (cherry picked from commit d0006e1)
Signed-off-by: Mohit Gupta <269879782+mohgupta-ship-it@users.noreply.github.com> (cherry picked from commit 3490e1a)
Signed-off-by: Mohit Gupta <269879782+mohgupta-ship-it@users.noreply.github.com> (cherry picked from commit 8fb192b)
…179-157 fix: address non-blocking reviewer nits from NVIDIA#178, NVIDIA#179, and NVIDIA#157
…e-exfiltration feat(analyzer): detect cloud-storage exfiltration as E5
docs(mcp): surface install extra in quick-start and add stdio caveat
…scape feat(analyzer): detect privileged container execution and escape primitives as PE5
…odel docs(mcp): document HTTP transport trust model
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>
…ider Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com> # Conflicts: # README.md # src/skillspector/providers/__init__.py # tests/nodes/test_meta_analyzer.py # tests/nodes/test_report.py
Signed-off-by: Steven Moy <github@stevenmoy.com>
Signed-off-by: Rod Boev <rod.boev@gmail.com>
Signed-off-by: Rod Boev <rod.boev@gmail.com>
Signed-off-by: Rod Boev <rod.boev@gmail.com>
…SON formats Signed-off-by: Rod Boev <rod.boev@gmail.com>
…bsent Signed-off-by: Rod Boev <rod.boev@gmail.com>
…ged-workload feat(analyzer): detect privileged Kubernetes workload deployment as TM4
Support Python 3.14
…utput-203 fix(cli): write concatenated multi-skill report to --output for non-JSON formats
fix(input): support scp-style SSH Git URLs in host validation
feat(ossf-scorecard): add ossf-scorecard github action integration
…-field fix(mcp): treat allowed-tools as a permission declaration for LP3
feat(providers): local agent-CLI providers (claude/codex/gemini), no API key
docs: correct stale analyzer status and dangling references
smoy approved these changes on Jun 30, 2026
Summary
Syncs `ExaForce/SkillSpector` with upstream `NVIDIA/SkillSpector@main` (`a5092dd..5df93c5`). This is a clean merge: our fork's only divergence from upstream is the standalone `benchmark/` project, which upstream does not touch, so there were no conflicts.
Upstream version bumps from 2.2.3 → 2.3.7 and brings ~124 files / +16.5k lines, including:
- … `scan_skill`; HTTP transport trust-model docs.
- `analysis_completeness` field, SARIF rules[] array, report sanitizer (strip ANSI/control bytes), fail-closed on degraded deep scans, per-rule diminishing returns to prevent score saturation.
- `--recursive` multi-skill scanning, Python 3.14 support, OSSF Scorecard GH action, Windows path/encoding fixes, `uv tool install` support.
Details
- `d3fac00`: Merge remote-tracking branch 'upstream/main'
- `benchmark/` directory confirmed untouched by the merge.
- `pyproject.toml`: version 2.2.3 → 2.3.7; added `langchain-aws`, `boto3`, optional `mcp` extra. Benchmark's editable dep on `skillspector` remains intact.
🤖 Generated with Claude Code